* Core Prep and Integration File

set more off

clear

cap n log close

log using 2a1-Prep-Core.log, replace

*******************
**Industry recode**
*******************

use ./temp/1a-US-Master.dta, clear
drop if ind1990>=900 | ind1990==.
gen I3=ind1990
gen Irev=ind1990
replace Irev=10 if I3==10
replace Irev=11 if I3==11
replace Irev=11 if I3==12
replace Irev=20 if I3==20
replace Irev=20 if I3==30
replace Irev=30 if I3==31
replace Irev=30 if I3==32
replace Irev=40 if I3==40
replace Irev=40 if I3==41
replace Irev=40 if I3==42
replace Irev=40 if I3==50
replace Irev=60 if I3==60
replace Irev=100 if I3==100
replace Irev=100 if I3==101
replace Irev=100 if I3==102
replace Irev=100 if I3==110
replace Irev=100 if I3==111
replace Irev=100 if I3==112
replace Irev=100 if I3==120
replace Irev=100 if I3==121
replace Irev=100 if I3==122
replace Irev=100 if I3==130
replace Irev=132 if I3==132
replace Irev=132 if I3==140
replace Irev=132 if I3==141
replace Irev=132 if I3==142
replace Irev=132 if I3==150
replace Irev=150 if I3==151
replace Irev=150 if I3==152
replace Irev=160 if I3==160
replace Irev=160 if I3==161
replace Irev=160 if I3==162
replace Irev=170 if I3==171
replace Irev=170 if I3==172
replace Irev=180 if I3==180
replace Irev=180 if I3==181
replace Irev=180 if I3==182
replace Irev=180 if I3==190
replace Irev=180 if I3==191
replace Irev=180 if I3==192
replace Irev=200 if I3==200
replace Irev=200 if I3==201
replace Irev=210 if I3==210
replace Irev=210 if I3==211
replace Irev=210 if I3==212
replace Irev=220 if I3==220
replace Irev=220 if I3==221
replace Irev=220 if I3==222
replace Irev=230 if I3==230
replace Irev=230 if I3==231
replace Irev=230 if I3==232
replace Irev=230 if I3==241
replace Irev=242 if I3==242
replace Irev=250 if I3==250
replace Irev=250 if I3==251
replace Irev=250 if I3==252
replace Irev=250 if I3==261
replace Irev=250 if I3==262
replace Irev=270 if I3==270
replace Irev=270 if I3==271
replace Irev=270 if I3==272
replace Irev=270 if I3==280
replace Irev=270 if I3==281
replace Irev=270 if I3==282
replace Irev=270 if I3==290
replace Irev=270 if I3==291
replace Irev=270 if I3==292
replace Irev=270 if I3==300
replace Irev=270 if I3==301
replace Irev=310 if I3==310
replace Irev=310 if I3==311
replace Irev=310 if I3==312
replace Irev=310 if I3==320
replace Irev=310 if I3==321
replace Irev=310 if I3==322
replace Irev=310 if I3==331
replace Irev=310 if I3==332
replace Irev=340 if I3==340
replace Irev=340 if I3==341
replace Irev=340 if I3==342
replace Irev=340 if I3==350
replace Irev=350 if I3==351
replace Irev=350 if I3==352
replace Irev=350 if I3==360
replace Irev=350 if I3==361
replace Irev=350 if I3==362
replace Irev=350 if I3==370
replace Irev=370 if I3==371
replace Irev=370 if I3==372
replace Irev=370 if I3==380
replace Irev=370 if I3==381
replace Irev=390 if I3==390
replace Irev=391 if I3==391
replace Irev=392 if I3==392
replace Irev=0 if I3==400
replace Irev=401 if I3==401
replace Irev=402 if I3==402
replace Irev=410 if I3==410
replace Irev=411 if I3==411
replace Irev=0 if I3==412
replace Irev=420 if I3==420
replace Irev=420 if I3==421
replace Irev=420 if I3==422
replace Irev=420 if I3==432
replace Irev=440 if I3==440
replace Irev=440 if I3==441
replace Irev=440 if I3==442
replace Irev=471 if I3==450
replace Irev=471 if I3==451
replace Irev=471 if I3==452
replace Irev=471 if I3==470
replace Irev=471 if I3==471
replace Irev=471 if I3==472
replace Irev=500 if I3==500
replace Irev=501 if I3==501
replace Irev=502 if I3==502
replace Irev=510 if I3==510
replace Irev=511 if I3==511
replace Irev=512 if I3==512
replace Irev=521 if I3==521
replace Irev=510 if I3==530
replace Irev=531 if I3==531
replace Irev=532 if I3==532
replace Irev=540 if I3==540
replace Irev=541 if I3==541
replace Irev=542 if I3==542
replace Irev=550 if I3==550
replace Irev=551 if I3==551
replace Irev=552 if I3==552
replace Irev=560 if I3==560
replace Irev=561 if I3==561
replace Irev=562 if I3==562
replace Irev=571 if I3==571
replace Irev=580 if I3==580
replace Irev=581 if I3==581
replace Irev=582 if I3==582
replace Irev=0 if I3==590
replace Irev=591 if I3==591
replace Irev=591 if I3==592
replace Irev=591 if I3==600
replace Irev=601 if I3==601
replace Irev=601 if I3==602
replace Irev=610 if I3==610
replace Irev=611 if I3==611
replace Irev=612 if I3==612
replace Irev=620 if I3==620
replace Irev=621 if I3==621
replace Irev=622 if I3==622
replace Irev=623 if I3==623
replace Irev=630 if I3==630
replace Irev=631 if I3==631
replace Irev=633 if I3==632
replace Irev=633 if I3==633
replace Irev=682 if I3==640
replace Irev=641 if I3==641
replace Irev=642 if I3==642
replace Irev=650 if I3==650
replace Irev=651 if I3==651
replace Irev=652 if I3==652
replace Irev=660 if I3==660
replace Irev=682 if I3==661
replace Irev=662 if I3==662
replace Irev=663 if I3==663
replace Irev=670 if I3==670
replace Irev=671 if I3==671
replace Irev=672 if I3==672
replace Irev=681 if I3==681
replace Irev=682 if I3==682
replace Irev=691 if I3==691
replace Irev=700 if I3==700
replace Irev=702 if I3==701
replace Irev=702 if I3==702
replace Irev=710 if I3==710
replace Irev=711 if I3==711
replace Irev=712 if I3==712
replace Irev=721 if I3==721
replace Irev=722 if I3==722
replace Irev=731 if I3==731
replace Irev=732 if I3==732
replace Irev=740 if I3==740
replace Irev=741 if I3==741
replace Irev=750 if I3==742
replace Irev=750 if I3==750
replace Irev=751 if I3==751
replace Irev=752 if I3==752
replace Irev=760 if I3==760
replace Irev=761 if I3==761
replace Irev=762 if I3==762
replace Irev=770 if I3==770
replace Irev=771 if I3==771
replace Irev=772 if I3==772
replace Irev=780 if I3==780
replace Irev=781 if I3==781
replace Irev=791 if I3==782
replace Irev=791 if I3==790
replace Irev=791 if I3==791
replace Irev=800 if I3==800
replace Irev=800 if I3==801
replace Irev=810 if I3==802
replace Irev=810 if I3==810
replace Irev=812 if I3==812
replace Irev=820 if I3==820
replace Irev=821 if I3==821
replace Irev=822 if I3==822
replace Irev=812 if I3==830
replace Irev=831 if I3==831
replace Irev=832 if I3==832
replace Irev=840 if I3==840
replace Irev=841 if I3==841
replace Irev=842 if I3==842
replace Irev=850 if I3==850
replace Irev=850 if I3==851
replace Irev=860 if I3==852
replace Irev=860 if I3==860
replace Irev=860 if I3==861
replace Irev=862 if I3==862
replace Irev=862 if I3==863
replace Irev=870 if I3==870
replace Irev=871 if I3==871
replace Irev=872 if I3==872
replace Irev=0 if I3==873
replace Irev=0 if I3==880
replace Irev=0 if I3==881
replace Irev=882 if I3==882
replace Irev=890 if I3==890
replace Irev=891 if I3==891
replace Irev=892 if I3==892
replace Irev=893 if I3==893

label define Irevlbl 10 "Agricultural production, crops", add
label define Irevlbl 11 "Veterinary and livestock services", add
label define Irevlbl 20 "Landscaping and misc. non-production agricultural services", add
label define Irevlbl 30 "Forestry, fishing and hunting", add
label define Irevlbl 40 "Mining", add
label define Irevlbl 60 "All construction", add
label define Irevlbl 100 "Food and kindred products", add
label define Irevlbl 132 "Textile mill products", add
label define Irevlbl 150 "Apparel and other finished textile products", add
label define Irevlbl 160 "Paper and allied products", add
label define Irevlbl 170 "Printing, publishing, and allied industries", add
label define Irevlbl 180 "Chemicals and allied products", add
label define Irevlbl 200 "Petroleum and coal products", add
label define Irevlbl 210 "Rubber and misc. plastics products", add
label define Irevlbl 220 "Leather and leather products", add
label define Irevlbl 230 "Lumber and wood products, ex furniture", add
label define Irevlbl 242 "Furniture and fixtures", add
label define Irevlbl 250 "Stone, clay, glass and concrete products", add
label define Irevlbl 270 "Metal industries", add
label define Irevlbl 310 "Machinery and computing equipment", add
label define Irevlbl 340 "Electrical machinery, equipment, and supplies", add
label define Irevlbl 350 "Transportation equipment", add
label define Irevlbl 370 "Professional and photographic equipment, and watches", add
label define Irevlbl 390 "Toys, amusement, and sporting goods", add
label define Irevlbl 391 "Miscellaneous manufacturing industries", add
label define Irevlbl 392 "Manufacturing industries, n.s.", add
label define Irevlbl 401 "Bus service and urban transit", add
label define Irevlbl 402 "Taxicab service", add
label define Irevlbl 410 "Trucking service", add
label define Irevlbl 411 "Warehousing and storage", add
label define Irevlbl 420 "Miscellaneous transportation", add
label define Irevlbl 440 "Communications", add
label define Irevlbl 471 "Utilities and sanitary services", add
label define Irevlbl 500 "Motor vehicles and equipment", add
label define Irevlbl 501 "Furniture and home furnishings", add
label define Irevlbl 502 "Lumber and construction materials", add
label define Irevlbl 510 "Machinery, equipment, and supplies", add
label define Irevlbl 511 "Metals and minerals, except petroleum", add
label define Irevlbl 512 "Electrical goods", add
label define Irevlbl 521 "Hardware, plumbing and heating supplies", add
label define Irevlbl 531 "Scrap and waste materials", add
label define Irevlbl 532 "Miscellaneous wholesale, durable goods", add
label define Irevlbl 540 "Paper and paper products", add
label define Irevlbl 541 "Drugs, chemicals, and allied products", add
label define Irevlbl 542 "Apparel, fabrics, and notions", add
label define Irevlbl 550 "Groceries and related products", add
label define Irevlbl 551 "Farm-product raw materials", add
label define Irevlbl 552 "Petroleum products", add
label define Irevlbl 560 "Alcoholic beverages", add
label define Irevlbl 561 "Farm supplies", add
label define Irevlbl 562 "Miscellaneous wholesale, nondurable goods", add
label define Irevlbl 571 "Wholesale trade, n.s.", add
label define Irevlbl 580 "Lumber and building material retailing", add
label define Irevlbl 581 "Hardware stores", add
label define Irevlbl 582 "Retail nurseries and garden stores", add
label define Irevlbl 591 "Misc. merchandise stores", add
label define Irevlbl 601 "Grocery stores", add
label define Irevlbl 610 "Retail bakeries", add
label define Irevlbl 611 "Food stores, n.e.c.", add
label define Irevlbl 612 "Motor vehicle dealers", add
label define Irevlbl 620 "Auto and home supply stores", add
label define Irevlbl 621 "Gasoline service stations", add
label define Irevlbl 622 "Miscellaneous vehicle dealers", add
label define Irevlbl 623 "Apparel and accessory stores, except shoe", add
label define Irevlbl 630 "Shoe stores", add
label define Irevlbl 631 "Furniture and home furnishings stores", add
label define Irevlbl 633 "Household appliance stores", add
label define Irevlbl 682 "Misc. retail stores", add
label define Irevlbl 641 "Eating and drinking places", add
label define Irevlbl 642 "Drug stores", add
label define Irevlbl 650 "Liquor stores", add
label define Irevlbl 651 "Sporting goods, bicycles, and hobby stores", add
label define Irevlbl 652 "Book and stationery stores", add
label define Irevlbl 660 "Jewelry stores", add
label define Irevlbl 662 "Sewing, needlework, and piece goods stores", add
label define Irevlbl 663 "Catalog and mail order houses", add
label define Irevlbl 670 "Vending machine operators", add
label define Irevlbl 671 "Direct selling establishments", add
label define Irevlbl 672 "Fuel dealers", add
label define Irevlbl 681 "Retail florists", add
label define Irevlbl 691 "Retail trade, n.s.", add
label define Irevlbl 700 "Banking", add
label define Irevlbl 702 "Savings and credit institutions", add
label define Irevlbl 710 "Security, commodity brokerage, and investment companies", add
label define Irevlbl 711 "Insurance", add
label define Irevlbl 712 "Real estate, including real estate-insurance offices", add
label define Irevlbl 721 "Advertising", add
label define Irevlbl 722 "Services to dwellings and other buildings", add
label define Irevlbl 731 "Personnel supply services", add
label define Irevlbl 732 "Computer and data processing services", add
label define Irevlbl 740 "Detective and protective services", add
label define Irevlbl 741 "Business services, n.e.c.", add
label define Irevlbl 750 "Automotive renting and parking", add
label define Irevlbl 751 "Automotive repair and related services", add
label define Irevlbl 752 "Electrical repair shops", add
label define Irevlbl 760 "Miscellaneous repair services", add
label define Irevlbl 761 "Private households", add
label define Irevlbl 762 "Hotels and motels", add
label define Irevlbl 770 "Lodging places, except hotels and motels", add
label define Irevlbl 771 "Laundry, cleaning, and garment services", add
label define Irevlbl 772 "Beauty shops", add
label define Irevlbl 780 "Barber shops", add
label define Irevlbl 781 "Funeral service and crematories", add
label define Irevlbl 791 "Misc. personal services", add
label define Irevlbl 800 "Theaters and video rental", add
label define Irevlbl 810 "Miscellaneous entertainment and recreation services", add
label define Irevlbl 812 "Offices and clinics of physicians and misc. health practitioners", add
label define Irevlbl 820 "Offices and clinics of dentists", add
label define Irevlbl 821 "Offices and clinics of chiropractors", add
label define Irevlbl 822 "Offices and clinics of optometrists", add
label define Irevlbl 831 "Hospitals", add
label define Irevlbl 832 "Nursing and personal care facilities", add
label define Irevlbl 840 "Health services, n.e.c.", add
label define Irevlbl 841 "Legal services", add
label define Irevlbl 842 "Elementary and secondary schools", add
label define Irevlbl 850 "Educational institutions", add
label define Irevlbl 860 "Educational services", add
label define Irevlbl 862 "Child care services", add
label define Irevlbl 870 "Residential care facilities, without nursing", add
label define Irevlbl 871 "Social services, n.e.c.", add
label define Irevlbl 872 "Museums, art galleries, and zoos", add
label define Irevlbl 882 "Engineering, architectural, and surveying services", add
label define Irevlbl 890 "Accounting, auditing, and bookkeeping services", add
label define Irevlbl 891 "Research, development, and testing services", add
label define Irevlbl 892 "Management and public relations services", add
label define Irevlbl 893 "Miscellaneous professional and related services", add
label values Irev Irevlbl
drop if Irev==0 | Irev==.
drop ind1990 I3
ren Irev ind1990
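* Added check (a sketch, not part of the original pipeline): list any recoded
* industry value that has no entry in Irevlbl. The local names codes and lbl
* are placeholders introduced here.
quietly levelsof ind1990, local(codes)
foreach c of local codes {
local lbl : label Irevlbl `c', strict
if `"`lbl'"'=="" {
display "No label defined for industry code `c'"
}
}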
save ./temp/1a-US-Master2.dta, replace

*********************************************
**Build the core worker samples for analysis**
*********************************************

* Identify industries with very low self-employment counts (none actually dropped)
use year ind1990 classwkr if classwkr==1 using ./temp/1a-US-Master2.dta, clear
gen ct=1
collapse (sum) ct, by(ind1990 year)
reshape wide ct, i(ind1990) j(year)
foreach v of varlist ct* {
replace `v'=0 if `v'==.
}
egen ctmin=rowmin(ct*)
gsort -ctmin
keep if ctmin<=50
list, clean noobs
keep ind1990
sort ind1990
save ./temp/ind-drops, replace

* Build worker sample
foreach x in 1980 1990 2000 2010 2018 {
use if year==`x' using ./temp/1a-US-Master2.dta, clear
g valid=(sex==1 & approx_age>=16 & age>=22 & age<=70 & ethn!=1 & classwkr!=0)
g validus=(sex==1 & age>=22 & age<=70 & ethn==1 & classwkr!=0)
keep if (valid==1 | validus==1)
drop if gq==3 | gq==4
sort ind1990
merge m:1 ind1990 using ./temp/ind-drops
keep if _m==1
drop _m
keep serial pernum perwt ethn ind1990 classwkr yrimmig metaread
compress
save ./temp/`x'-WorkerResults.dta, replace
}
erase ./temp/ind-drops.dta

* Sample for export to Unix and for the wage analysis
use perwt puma age yrimmig speakeng educ educd ind1990 classwkr classwkrd inctot incbus00 approx_age ethn sex metaread year if year==2000 using ./temp/1a-US-Master2.dta, clear
drop year
save ./temp/1a-US-Master2-Wage.dta, replace

******************************************************************
** CORE [1]: Self-employed, no limit on min observations        **
******************************************************************

foreach x in 1980 1990 2000 2010 2018 {
use if classwkr==1 using ./temp/`x'-WorkerResults.dta, clear
sort ethn ind1990
egen indemp=total(perwt), by(ind1990)
egen emp=total(perwt)
g nat_share=indemp/emp
egen indemp_gp=total(perwt), by(ethn ind1990)
egen emp_gp=total(perwt), by(ethn)
g gp_share=indemp_gp/emp_gp
g overage=gp_share/nat_share
egen obs=count(serial), by(ethn ind1990)
keep ethn ind1990 indemp_gp gp_share overage obs
duplicates drop
gsort ethn -overage
save ./temp/`x'-temp1a, replace

***Calculate OVER1 and OVER2 before the industry drop***
* OVER1 = weighted avg. of overage ratio
* OVER2 = weighted avg. overage ratio of top 3 largest industries
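* In symbols (a restatement of the code in this loop; the notation is not from
* the original file), for group g and industry i with weighted employment emp:
*   overage(g,i) = [emp(g,i)/emp(g)] / [emp(i)/emp]
*   OVER1(g)     = sum over i of w(g,i)*overage(g,i), with w(g,i)=emp(g,i)/emp(g)
*   OVER2(g)     = the same weighted average over the 3 largest industries of g,
*                  with weights renormalized to sum to 1 within those 3
* Hypothetical example: a group with 75% of its workers in an industry with
* overage 3 and 25% in one with overage 0.5 has OVER1 = 0.75*3 + 0.25*0.5 = 2.375.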
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER1=total(temp3), by(ethn)
drop temp*
gsort ethn -indemp_gp -obs -ind1990
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER2=total(temp3), by(ethn)
keep if rank==1
ren ind1990 Ibig
keep ethn OVER1 OVER2 Ibig
sort ethn
save ./temp/`x'-temp1b, replace

***Compute industry-level outlier measures***
* OVER3 = weighted avg. of top 3 overage ratios
* OVER4 = max overage ratio
use ./temp/`x'-temp1a, clear
gsort ethn -overage
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER3 = total(temp3), by(ethn)
keep if rank==1
ren overage OVER4
ren ind1990 Imax
sort ethn
save ./temp/`x'-temp1c, replace

***Remerge files and save***
use ./temp/`x'-temp1b, clear
merge 1:1 ethn using ./temp/`x'-temp1c
g self=1
keep ethn self OVER* I*
order ethn self OVER* I*
compress
sort ethn
save ./temp/`x'-temp1-selfemp, replace
}

sleep 5000
foreach x in 1980 1990 2000 2010 2018 {
for any a b c: erase ./temp/`x'-temp1X.dta
}

******************************************************************
** EXTENSION [0]: All employees                                 **
******************************************************************

foreach x in 1980 1990 2000 2010 2018 {
use ./temp/`x'-WorkerResults.dta, clear
sort ethn ind1990
egen indemp=total(perwt), by(ind1990)
egen emp=total(perwt)
g nat_share=indemp/emp
egen indemp_gp=total(perwt), by(ethn ind1990)
egen emp_gp=total(perwt), by(ethn)
g gp_share=indemp_gp/emp_gp
g overage=gp_share/nat_share
egen obs=count(serial), by(ethn ind1990)
keep ethn ind1990 indemp_gp gp_share overage obs
duplicates drop
gsort ethn -overage
save ./temp/`x'-temp1a, replace

***Calculate OVER1 and OVER2 before the industry drop***
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER1=total(temp3), by(ethn)
drop temp*
gsort ethn -indemp_gp -obs -ind1990
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER2=total(temp3), by(ethn)
keep if rank==1
ren ind1990 Ibig
keep ethn OVER1 OVER2 Ibig
sort ethn
save ./temp/`x'-temp1b, replace

***Compute industry-level outlier measures***
use ./temp/`x'-temp1a, clear
gsort ethn -overage
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER3 = total(temp3), by(ethn)
keep if rank==1
ren overage OVER4
ren ind1990 Imax
sort ethn
save ./temp/`x'-temp1c, replace

***Remerge files and save***
use ./temp/`x'-temp1b, clear
merge 1:1 ethn using ./temp/`x'-temp1c
g self=0
keep ethn self OVER* I*
order ethn self OVER* I*
compress
sort ethn
save ./temp/`x'-temp1-allemp, replace
}

sleep 5000
foreach x in 1980 1990 2000 2010 2018 {
for any a b c: erase ./temp/`x'-temp1X.dta
}

******************************************************************
** EXTENSION [2]: Recent Arrivals Excluded                      **
******************************************************************

foreach x in 1980 1990 2000 2010 2018 {
use ./temp/`x'-WorkerResults.dta, clear
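* Note: a missing yrimmig compares as larger than any number in Stata, so non-natives with missing yrimmig are also dropped here
drop if (ethn!=1 & yrimmig>`x'-5)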
sort ethn ind1990
egen indemp=total(perwt), by(ind1990)
egen emp=total(perwt)
g nat_share=indemp/emp
egen indemp_gp=total(perwt), by(ethn ind1990)
egen emp_gp=total(perwt), by(ethn)
g gp_share=indemp_gp/emp_gp
g overage=gp_share/nat_share
egen obs=count(serial), by(ethn ind1990)
keep ethn ind1990 indemp_gp gp_share overage obs
duplicates drop
gsort ethn -overage
save ./temp/`x'-temp1a, replace

***Calculate OVER1 and OVER2 before the industry drop***
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER1=total(temp3), by(ethn)
drop temp*
gsort ethn -indemp_gp -obs -ind1990
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER2=total(temp3), by(ethn)
keep if rank==1
ren ind1990 Ibig
keep ethn OVER1 OVER2 Ibig
sort ethn
save ./temp/`x'-temp1b, replace

***Compute industry-level outlier measures***
use ./temp/`x'-temp1a, clear
gsort ethn -overage
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER3 = total(temp3), by(ethn)
keep if rank==1
ren overage OVER4
ren ind1990 Imax
sort ethn
save ./temp/`x'-temp1c, replace

***Remerge files and save***
use ./temp/`x'-temp1b, clear
merge 1:1 ethn using ./temp/`x'-temp1c
g self=2
keep ethn self OVER* I*
order ethn self OVER* I*
compress
sort ethn
save ./temp/`x'-temp1-norecent, replace
}

sleep 5000
foreach x in 1980 1990 2000 2010 2018 {
for any a b c: erase ./temp/`x'-temp1X.dta
}

******************************************************************
** EXTENSION [3]: No Natives                                    **
******************************************************************

foreach x in 1980 1990 2000 2010 2018 {
use if ethn!=1 using ./temp/`x'-WorkerResults.dta, clear
sort ethn ind1990
egen indemp=total(perwt), by(ind1990)
egen emp=total(perwt)
g nat_share=indemp/emp
egen indemp_gp=total(perwt), by(ethn ind1990)
egen emp_gp=total(perwt), by(ethn)
g gp_share=indemp_gp/emp_gp
g overage=gp_share/nat_share
egen obs=count(serial), by(ethn ind1990)
keep ethn ind1990 indemp_gp gp_share overage obs
duplicates drop
gsort ethn -overage
save ./temp/`x'-temp1a, replace

***Calculate OVER1 and OVER2 before the industry drop***
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER1=total(temp3), by(ethn)
drop temp*
gsort ethn -indemp_gp -obs -ind1990
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER2=total(temp3), by(ethn)
keep if rank==1
ren ind1990 Ibig
keep ethn OVER1 OVER2 Ibig
sort ethn
save ./temp/`x'-temp1b, replace

***Compute industry-level outlier measures***
use ./temp/`x'-temp1a, clear
gsort ethn -overage
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER3 = total(temp3), by(ethn)
keep if rank==1
ren overage OVER4
ren ind1990 Imax
sort ethn
save ./temp/`x'-temp1c, replace

***Remerge files and save***
use ./temp/`x'-temp1b, clear
merge 1:1 ethn using ./temp/`x'-temp1c
g self=3
keep ethn self OVER* I*
order ethn self OVER* I*
compress
sort ethn
save ./temp/`x'-temp1-nonatives, replace
}

sleep 5000
foreach x in 1980 1990 2000 2010 2018 {
for any a b c: erase ./temp/`x'-temp1X.dta
}

******************************************************************
** EXTENSION [4]: Restricted to min obs in cells for outliers   **
******************************************************************

foreach x in 1980 1990 2000 2010 2018 {
use if ethn!=1 using ./temp/`x'-WorkerResults.dta, clear
sort ethn ind1990
egen indemp=total(perwt), by(ind1990)
egen emp=total(perwt)
g nat_share=indemp/emp
egen indemp_gp=total(perwt), by(ethn ind1990)
egen emp_gp=total(perwt), by(ethn)
g gp_share=indemp_gp/emp_gp
g overage=gp_share/nat_share
egen obs=count(serial), by(ethn ind1990)
keep ethn ind1990 indemp_gp gp_share overage obs
drop if obs<3
duplicates drop
gsort ethn -overage
save ./temp/`x'-temp1a, replace

***Calculate OVER1 and OVER2 before the industry drop***
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER1=total(temp3), by(ethn)
drop temp*
gsort ethn -indemp_gp -obs -ind1990
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER2=total(temp3), by(ethn)
keep if rank==1
ren ind1990 Ibig
keep ethn OVER1 OVER2 Ibig
sort ethn
save ./temp/`x'-temp1b, replace

***Compute industry-level outlier measures***
use ./temp/`x'-temp1a, clear
gsort ethn -overage
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER3 = total(temp3), by(ethn)
keep if rank==1
ren overage OVER4
ren ind1990 Imax
sort ethn
save ./temp/`x'-temp1c, replace

***Remerge files and save***
use ./temp/`x'-temp1b, clear
merge 1:1 ethn using ./temp/`x'-temp1c
g self=4
keep ethn self OVER* I*
order ethn self OVER* I*
compress
sort ethn
save ./temp/`x'-temp1-restricted, replace
}

sleep 5000
foreach x in 1980 1990 2000 2010 2018 {
for any a b c: erase ./temp/`x'-temp1X.dta
}

******************************************************************
** EXTENSION [5]: No Taxi                                       **
******************************************************************

foreach x in 1980 1990 2000 2010 2018 {
use ./temp/`x'-WorkerResults.dta, clear
drop if ind1990==402
sort ethn ind1990
egen indemp=total(perwt), by(ind1990)
egen emp=total(perwt)
g nat_share=indemp/emp
egen indemp_gp=total(perwt), by(ethn ind1990)
egen emp_gp=total(perwt), by(ethn)
g gp_share=indemp_gp/emp_gp
g overage=gp_share/nat_share
egen obs=count(serial), by(ethn ind1990)
keep ethn ind1990 indemp_gp gp_share overage obs
duplicates drop
gsort ethn -overage
save ./temp/`x'-temp1a, replace

***Calculate OVER1 and OVER2 before the industry drop***
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER1=total(temp3), by(ethn)
drop temp*
gsort ethn -indemp_gp -obs -ind1990
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER2=total(temp3), by(ethn)
keep if rank==1
ren ind1990 Ibig
keep ethn OVER1 OVER2 Ibig
sort ethn
save ./temp/`x'-temp1b, replace

***Compute industry-level outlier measures***
use ./temp/`x'-temp1a, clear
gsort ethn -overage
by ethn: g rank=_n
keep if rank<=3
egen temp1=total(indemp_gp), by(ethn)
g temp2=indemp_gp/temp1
g temp3=temp2*overage
egen OVER3 = total(temp3), by(ethn)
keep if rank==1
ren overage OVER4
ren ind1990 Imax
sort ethn
save ./temp/`x'-temp1c, replace

***Remerge files and save***
use ./temp/`x'-temp1b, clear
merge 1:1 ethn using ./temp/`x'-temp1c
g self=5
keep ethn self OVER* I*
order ethn self OVER* I*
compress
sort ethn
save ./temp/`x'-temp1-notaxi, replace
}

sleep 5000
foreach x in 1980 1990 2000 2010 2018 {
for any a b c: erase ./temp/`x'-temp1X.dta
}

****************************
**Combine overage datasets**
****************************
*Create overage file

foreach x in 1980 1990 2000 2010 2018 {
use ./temp/`x'-temp1-selfemp.dta, clear
erase ./temp/`x'-temp1-selfemp.dta
append using ./temp/`x'-temp1-allemp.dta
erase ./temp/`x'-temp1-allemp.dta
append using ./temp/`x'-temp1-norecent.dta
erase ./temp/`x'-temp1-norecent.dta
append using ./temp/`x'-temp1-nonatives.dta
erase ./temp/`x'-temp1-nonatives.dta
append using ./temp/`x'-temp1-restricted.dta
erase ./temp/`x'-temp1-restricted.dta
append using ./temp/`x'-temp1-notaxi.dta
erase ./temp/`x'-temp1-notaxi.dta

label var ethn "Ethnicity"
label var self "Sample: 0 all emp, 1 self-emp, 2 no recent, 3 no natives, 4 min obs, 5 no taxi"
label var OVER1 "Avg. overage"
label var OVER2 "Avg. overage, 3 largest industries"
label var OVER3 "Top 3 overage"
label var OVER4 "Max overage"
label var Ibig "Largest industry"
label var Imax "Max overage industry"
order ethn self OVER1 OVER2 OVER3 OVER4 I*
gsort ethn self
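* Added check (a sketch, not in the original): each ethnicity should appear at
* most once per sample code, so this report should show no surplus copies.
duplicates report ethn self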
compress
save ./temp/`x'-OverageResults.dta, replace
}

************************************
**Build the US in-marriage files  **
************************************
* Use original file pre-industry drops

foreach x in 1980 1990 2000 2010 2018 {
use if year==`x' & ethn!=1 & sploc!=0 using ./temp/1a-US-Master.dta, clear
keep if (approx_age>=0 & approx_age<=15) & (age>=22 & age<=70)
keep serial ethn sex ancestr1 perwt sploc
ren sploc pernum
ren ethn sp_ethn
ren sex sp_sex
ren ancestr1 sp_ancestr
compress
save ./temp/sploc`x'.dta, replace
use if year==`x' using ./temp/1a-US-Master.dta, clear
merge 1:1 serial pernum using ./temp/sploc`x', keepus(perwt sp_ethn sp_sex sp_ancestr)
keep if _m==3
drop _m
erase ./temp/sploc`x'.dta
g imr=(ethn==sp_ethn) | (ancestr1==sp_ancestr)
g imrx=(ethn==sp_ethn)
for any imr imrx: replace X=X*perwt
collapse (sum) imr imrx perwt, by(sp_ethn) fast
for any imr imrx: replace X=X/perwt
ren sp_ethn ethn
ren perwt obs
g int census=`x'
save ./temp/img`x'.dta, replace
}

sleep 5000
use ./temp/img1980.dta, clear
for any 1990 2000 2010 2018: append using ./temp/imgX.dta
for any 1980 1990 2000 2010 2018: erase ./temp/imgX.dta
sort ethn census
ren imr iso
ren imrx isox
label var ethn "Ethnicity"
label var census "Census Year"
label var iso "In-Marriage Rate"
label var isox "In-Marriage Rate Strict"
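* Quick look (added sketch, not in the original): both rates are weighted shares,
* so their ranges should lie within [0,1] up to floating-point rounding.
summarize iso isox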
label data "In-Marriage Rates"
compress
save ./temp/IMR-US.dta, replace

*************************
**Build the UK imr file**
*************************
* No age at immigration data

import excel ./data-input/Groups.xlsx, sheet("uk") firstrow clear
save ./temp/ukmapping.dta, replace

use if ethn!=1 & sploc!=0 using ./temp/1b-UK-Master.dta, clear
keep if (age>=22 & age<=70)
keep serial ethn sex perwt sploc
ren sploc pernum
ren ethn sp_ethn
ren sex sp_sex
compress
save ./temp/1991_UK_IPUMS_sploc.dta, replace
use ./temp/1b-UK-Master.dta, clear
merge 1:1 serial pernum using ./temp/1991_UK_IPUMS_sploc, keepus(perwt sp_ethn sp_sex)
keep if _m==3
drop _m
erase ./temp/1991_UK_IPUMS_sploc.dta
g imr=(ethn==sp_ethn)
replace imr=imr*perwt
collapse (sum) imr perwt, by(sp_ethn) fast
replace imr=imr/perwt
ren sp_ethn ethn
ren perwt obs
compress
sort ethn
ren imr iso1991
label var ethn "Ethnicity"
label var iso1991 "In-Marriage Rate"
label data "In-Marriage Rates"
ren obs obs1991
gsort -iso1991
merge 1:m ethn using ./temp/ukmapping.dta
erase ./temp/ukmapping.dta
rename ethn ukctry
rename masterethn ethn
rename ethnlbl ukctrylbl
rename masterethnlbl ethnlbl
keep if _m==3
replace iso1991=. if ethn==41000
drop _m obs1991
order ethn ethnlbl ukctry ukctrylbl iso1991
gsort -iso1991
save ./temp/IMR-UK.dta, replace

****************************
**Build the Spain imr file**
****************************

import excel ./data-input/Groups.xlsx, sheet("spain") firstrow clear
save ./temp/SpainMapping.dta, replace

use if ethn!=1 & ethn!=43120 & ethn!=43121 & sploc!=0 using ./temp/1c-Spain-Master, clear
keep if (approx_age>=0 & approx_age<=15) & (age>=22 & age<=70)
keep serial ethn sex perwt sploc
ren sploc pernum
ren ethn sp_ethn
ren sex sp_sex
compress
save ./temp/2011_Spain_IPUMS_sploc.dta, replace
use ./temp/1c-Spain-Master, clear
merge 1:1 serial pernum using ./temp/2011_Spain_IPUMS_sploc, keepus(perwt sp_ethn sp_sex)
keep if _m==3
drop _m
erase ./temp/2011_Spain_IPUMS_sploc.dta
g imr=(ethn==sp_ethn)
replace imr=imr*perwt
collapse (sum) imr perwt, by(sp_ethn) fast
replace imr=imr/perwt
ren sp_ethn ethn
ren perwt obs
compress
sort ethn
ren imr iso2011
label var ethn "Ethnicity"
label var iso2011 "In-Marriage Rate"
label data "In-Marriage Rates"
ren obs obs2011
gsort -iso2011
merge 1:m ethn using ./temp/SpainMapping.dta
erase ./temp/SpainMapping.dta
ren ethn spainctry
ren masterethn ethn
ren ethnlbl spainctrylbl
ren masterethnlbl ethnlbl
drop _m obs2011
order ethn ethnlbl spainctry spainctrylbl iso2011
gsort -iso2011
save ./temp/IMR-Spain.dta, replace

**********************************************************
**Build the genetic and cultural distance files         **
**********************************************************

* Create US focused version
* Replace the missing linguistic and religious distance data for Germany with GDR data
use if country_1=="U.S.A" | country_2=="U.S.A" using ./data-input/cultdist, clear
drop total* cognate* reldist_dominant_WCD_form reldist_weighted_WCD_form
replace wacziarg_1=wacziarg_2 if country_1=="U.S.A"
replace country_1=country_2 if country_1=="U.S.A"
drop wacziarg_2 country_2
sort country_1
ren fst_distance_dominant gplu
ren fst_distance_weighted gwtd
ren lingdist_dom_formula lplu
ren lingdist_weighted_formula lwtd
ren reldist_dominant_formula rplu
ren reldist_weighted_formula rwtd
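* Copy the values in row 66 into row 67. These are hard-coded positions after the
* sort above (assumed to be the GDR and Germany), so recheck them if cultdist changes
replace lplu= lplu[66] in 67
replace lwtd= lwtd[66] in 67
replace rplu= rplu[66] in 67
replace rwtd= rwtd[66] in 67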
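* Optional check, not part of the original pipeline: a minimal sketch that prints
* the two rows touched by the position-based replacements (66 and 67) so the log
* shows which countries they are. The variable list is an assumption based on the
* renames and drops earlier in this block.
list country_1 lplu lwtd rplu rwtd in 66/67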
save ./temp/wacziargforus.dta, replace
sleep 5000

import excel ./data-input/Groups.xlsx, sheet("genetic") firstrow clear
save ./temp/wacziargmapping.dta, replace
merge m:1 wacziarg_1 using ./temp/wacziargforus.dta
keep if _m==3
ren wacziarg_1 wacziarggroups
drop _m country_1
gsort -gwtd
save ./temp/wacziarggroups.dta, replace
erase ./temp/wacziargforus.dta

* For each ethn, create the average distance to the groups that make up the US
* population, weighted by each group's share in the 2000 data
* Replace the missing linguistic and religious distance data for Germany (207) with GDR (193) data
use ./data-input/cultdist.dta, clear
ren fst_distance_dominant gplu
ren fst_distance_weighted gwtd
ren lingdist_dom_formula lplu
ren lingdist_weighted_formula lwtd
ren reldist_dominant_formula rplu
ren reldist_weighted_formula rwtd
drop total* cognate* reldist_dominant_WCD_form reldist_weighted_WCD_form
save ./temp/cultdistsimp.dta, replace

keep if wacziarg_1==193 | wacziarg_2==193
replace wacziarg_1=207 if wacziarg_1==193
replace wacziarg_2=207 if wacziarg_2==193
egen waczpair = concat(wacziarg_1 wacziarg_2), punct(-)
egen waczpair2 = concat(wacziarg_2 wacziarg_1), punct(-)
replace waczpair=waczpair2 if wacziarg_1==207
drop gplu gwtd
save ./temp/GDR-pair.dta, replace
sleep 5000

use ./temp/cultdistsimp.dta, clear
egen waczpair = concat(wacziarg_1 wacziarg_2), punct(-)
egen waczpair2 = concat(wacziarg_2 wacziarg_1), punct(-)
save ./temp/cultdist-pair.dta, replace
keep if wacziarg_1==207 | wacziarg_2==207
drop lplu lwtd rplu rwtd
merge 1:1 waczpair using ./temp/GDR-pair.dta
* Print rows 208 and 119 so the log shows the values copied by position below
list if _n==208 | _n==119
replace lplu= lplu[208] in 119
replace lwtd= lwtd[208] in 119
replace rplu= rplu[208] in 119
replace rwtd= rwtd[208] in 119
drop if _m==2
drop _m
save ./temp/GDR-pair.dta, replace
sleep 5000

use ./temp/cultdist-pair.dta, clear
drop if wacziarg_1==207 | wacziarg_2==207
append using ./temp/GDR-pair.dta
save ./temp/cultdist-pair.dta, replace
erase ./temp/GDR-pair.dta
erase ./temp/cultdistsimp.dta

use if year==2000 using ./temp/1a-US-Master.dta, clear
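* Note: without a weight, (percent) returns each ethn's share of nonmissing perwt
* observations (a share of records), not a perwt-weighted population share
collapse (percent) perwt, by(ethn)
gen comp = (perwt)/100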
drop perwt
gsort -comp
gen year=2000
save ./temp/USethncomp.dta, replace
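* Optional check, not part of the original pipeline: a minimal sketch that reports
* the sum of the shares in comp. Because comp is a percentage divided by 100, the
* sum should be close to 1.
quietly summarize comp
display "Sum of US ethn shares (comp): " r(sum)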
ren ethn ethn1
ren comp comp1
joinby year using ./temp/USethncomp.dta
drop year
ren ethn ethn2
ren comp comp2
save ./temp/USethncomp-pair.dta, replace

use ./temp/wacziargmapping.dta, clear
ren ethn ethn1
merge 1:m ethn1 using ./temp/USethncomp-pair.dta
drop _m ethnlbl country_1
ren wacziarg_1 wacziarg1
save ./temp/USethncomp-wacziargpair.dta, replace
sleep 5000

use ./temp/wacziargmapping.dta, clear
ren ethn ethn2
merge 1:m ethn2 using ./temp/USethncomp-wacziargpair.dta
drop _m ethnlbl country_1
ren wacziarg_1 wacziarg2
egen waczpair = concat(wacziarg1 wacziarg2), punct(-)
egen waczpair2 = concat(wacziarg2 wacziarg1), punct(-)
save ./temp/USethncomp-wacziargpair.dta, replace

* The file merged below includes self-pairs such as 177-177 (a country paired with itself)
merge m:1 waczpair using ./temp/cultdist-pair.dta
gen missing=1 if wacziarg_1==.
keep if missing==1
keep ethn1 comp1 wacziarg1 ethn2 comp2 wacziarg2 waczpair waczpair2
ren waczpair2 waczpair2ref
ren waczpair waczpair2
merge m:1 waczpair2 using ./temp/cultdist-pair.dta
keep if _m==3
ren country_2 country1
ren country_1 country2
drop _m waczpair wacziarg_1 wacziarg_2
ren waczpair2 waczpair
ren waczpair2ref waczpair2
save ./temp/USethncomp-cultdist-pairreverse.dta, replace

use ./temp/USethncomp-wacziargpair.dta, clear
merge m:1 waczpair using ./temp/cultdist-pair.dta
gen missing=1 if wacziarg_1==.
replace missing=0 if ethn1==ethn2
replace missing=1 if ethn1==. & ethn2==.
drop if missing==1
keep if _m==3 | missing==0
ren country_1 country1
ren country_2 country2
drop _m missing wacziarg_1 wacziarg_2
append using ./temp/USethncomp-cultdist-pairreverse.dta
order ethn1 wacziarg1 ethn2 wacziarg2
sort wacziarg1 wacziarg2
drop waczpair waczpair2

* Where missing, a group's distance to itself is set to 0 (rwtd is set to .089 instead)
foreach x in gplu gwtd lplu lwtd rplu {
replace `x'=0 if `x'==. & ethn1==ethn2
}
replace rwtd=.089 if rwtd==. & ethn1==ethn2

gen gplucomp = (gplu)*(comp2)
gen gwtdcomp = (gwtd)*(comp2)
gen lplucomp = (lplu)*(comp2)
gen lwtdcomp = (lwtd)*(comp2)
gen rplucomp = (rplu)*(comp2)
gen rwtdcomp = (rwtd)*(comp2)
drop gplu gwtd lplu lwtd rplu rwtd
* Sum the share-weighted distances within ethn1; the n_* counts track nonmissing terms
collapse (sum) gplucomp gwtdcomp lplucomp lwtdcomp rplucomp rwtdcomp ///
    (count) n_gplucomp=gplucomp n_gwtdcomp=gwtdcomp n_lplucomp=lplucomp ///
    n_lwtdcomp=lwtdcomp n_rplucomp=rplucomp n_rwtdcomp=rwtdcomp, by(ethn1)
* A sum built from a single nonmissing term is treated as missing
replace gplucomp = . if n_gplucomp==1
replace gwtdcomp = . if n_gwtdcomp==1
replace lplucomp = . if n_lplucomp==1
replace lwtdcomp = . if n_lwtdcomp==1
replace rplucomp = . if n_rplucomp==1
replace rwtdcomp = . if n_rwtdcomp==1
drop n_*
ren ethn1 ethn
save ./temp/wacziargwtduscomp.dta, replace

for any cultdist-pair USethncomp USethncomp-cultdist-pairreverse USethncomp-pair USethncomp-wacziargpair wacziargmapping: erase ./temp/X.dta

**************************
**Build the Master files**
**************************

* Size file (log of summed person weights by ethn and year)
use ethn perwt year using ./temp/1a-US-Master2, clear
collapse (sum) perwt, by(ethn year) fast
gen size=ln(perwt)
keep ethn size year
reshape wide size, i(ethn) j(year)
sort ethn
save ./temp/US-Size.dta, replace

* Gravity files
import excel ./data-input/Groups.xlsx, sheet("distpop") firstrow clear
sort ethnmerge
save ./temp/distpop.dta, replace
import excel ./data-input/Groups.xlsx, sheet("gravity") firstrow clear
save ./temp/distpopmapping.dta, replace
use ./temp/distpop.dta, clear
merge 1:m ethnmerge using ./temp/distpopmapping.dta
keep if _m==3
drop _m ethnmerge
order ethn ethnlbl
save ./temp/US-Gravity.dta, replace
sleep 5000
erase ./temp/distpop.dta
erase ./temp/distpopmapping.dta

* Weights (log of summed person weights by ethn, one file per census year)
foreach x in 1980 1990 2000 2010 2018 {
use ethn perwt using ./temp/`x'-WorkerResults.dta, clear
collapse (sum) perwt, by (ethn) fast
gen weight=ln(perwt)
keep ethn weight
sort ethn
save ./temp/`x'-weights.dta, replace
}

* Create Masters (one Core1 file per census year: IMR, weights, gravity, size, and distance data)
foreach x in 1980 1990 2000 2010 2018 {
use ./temp/IMR-US, clear
drop obs
reshape wide iso isox, i(ethn) j(census)
keep ethn iso*
merge 1:m ethn using ./temp/`x'-OverageResults.dta
keep if _m==3
drop _m
merge m:1 ethn using ./temp/`x'-weights.dta
keep if _m==3
drop _m
decode ethn, gen(ethnmerge)
merge m:1 ethn using ./temp/US-Gravity
keep if _m==3
drop _m ethnlbl
merge m:1 ethn using ./temp/US-Size
keep if _m==3
drop _m

merge m:1 ethn using ./temp/IMR-UK.dta
keep if _m==3
drop _m ethnlbl ukctrylbl
label define ukctrylbl 00000 `"NIU (not in universe)"'
label define ukctrylbl 10000 `"Africa"', add
label define ukctrylbl 11000 `"Eastern Africa"', add
label define ukctrylbl 11010 `"Burundi"', add
label define ukctrylbl 11020 `"Comoros"', add
label define ukctrylbl 11030 `"Djibouti"', add
label define ukctrylbl 11040 `"Eritrea"', add
label define ukctrylbl 11050 `"Ethiopia"', add
label define ukctrylbl 11051 `"Ethiopia (including Eritrea)"', add
label define ukctrylbl 11060 `"Kenya"', add
label define ukctrylbl 11070 `"Madagascar"', add
label define ukctrylbl 11080 `"Malawi"', add
label define ukctrylbl 11090 `"Mauritius"', add
label define ukctrylbl 11100 `"Mozambique"', add
label define ukctrylbl 11110 `"Reunion"', add
label define ukctrylbl 11120 `"Rwanda"', add
label define ukctrylbl 11130 `"Seychelles"', add
label define ukctrylbl 11140 `"Somalia"', add
label define ukctrylbl 11150 `"South Sudan"', add
label define ukctrylbl 11160 `"Uganda"', add
label define ukctrylbl 11170 `"Tanzania"', add
label define ukctrylbl 11180 `"Zambia"', add
label define ukctrylbl 11190 `"Zimbabwe"', add
label define ukctrylbl 11990 `"Eastern Africa, n.s."', add
label define ukctrylbl 12000 `"Middle Africa"', add
label define ukctrylbl 12010 `"Angola"', add
label define ukctrylbl 12020 `"Cameroon"', add
label define ukctrylbl 12030 `"Central African Republic"', add
label define ukctrylbl 12040 `"Chad"', add
label define ukctrylbl 12050 `"Congo"', add
label define ukctrylbl 12060 `"Democratic Republic of Congo"', add
label define ukctrylbl 12070 `"Equatorial Guinea"', add
label define ukctrylbl 12080 `"Gabon"', add
label define ukctrylbl 12090 `"Sao Tome and Principe"', add
label define ukctrylbl 12990 `"Middle Africa, n.s."', add
label define ukctrylbl 13000 `"Northern Africa"', add
label define ukctrylbl 13010 `"Algeria"', add
label define ukctrylbl 13011 `"Algeria/Tunisia"', add
label define ukctrylbl 13020 `"Egypt"', add
label define ukctrylbl 13021 `"Egypt/Sudan"', add
label define ukctrylbl 13030 `"Libya"', add
label define ukctrylbl 13040 `"Morocco"', add
label define ukctrylbl 13050 `"Sudan"', add
label define ukctrylbl 13060 `"Tunisia"', add
label define ukctrylbl 13070 `"Western Sahara"', add
label define ukctrylbl 13990 `"Northern Africa, n.s."', add
label define ukctrylbl 14000 `"Southern Africa"', add
label define ukctrylbl 14010 `"Botswana"', add
label define ukctrylbl 14020 `"Lesotho"', add
label define ukctrylbl 14030 `"Namibia"', add
label define ukctrylbl 14040 `"South Africa"', add
label define ukctrylbl 14050 `"Swaziland"', add
label define ukctrylbl 14990 `"Southern Africa, n.s."', add
label define ukctrylbl 15000 `"Western Africa"', add
label define ukctrylbl 15010 `"Benin"', add
label define ukctrylbl 15020 `"Burkina Faso"', add
label define ukctrylbl 15021 `"Upper Volta"', add
label define ukctrylbl 15030 `"Cape Verde"', add
label define ukctrylbl 15040 `"Ivory Coast"', add
label define ukctrylbl 15050 `"Gambia"', add
label define ukctrylbl 15060 `"Ghana"', add
label define ukctrylbl 15070 `"Guinea"', add
label define ukctrylbl 15080 `"Guinea-Bissau"', add
label define ukctrylbl 15081 `"Guinea-Bissau and Cape Verde"', add
label define ukctrylbl 15090 `"Liberia"', add
label define ukctrylbl 15100 `"Mali"', add
label define ukctrylbl 15110 `"Mauritania"', add
label define ukctrylbl 15120 `"Niger"', add
label define ukctrylbl 15130 `"Nigeria"', add
label define ukctrylbl 15140 `"St. Helena and Ascension"', add
label define ukctrylbl 15150 `"Senegal"', add
label define ukctrylbl 15160 `"Sierra Leone"', add
label define ukctrylbl 15170 `"Togo"', add
label define ukctrylbl 15180 `"Canary Islands"', add
label define ukctrylbl 15990 `"West Africa, n.s."', add
label define ukctrylbl 19990 `"Africa, other and n.s."', add
label define ukctrylbl 19991 `"Central and South Africa"', add
label define ukctrylbl 19992 `"East and Central Africa"', add
label define ukctrylbl 19993 `"Southeastern Africa"', add
label define ukctrylbl 19994 `"Saharan Africa"', add
label define ukctrylbl 19999 `"Africa, n.s."', add
label define ukctrylbl 20000 `"Americas"', add
label define ukctrylbl 21000 `"Caribbean"', add
label define ukctrylbl 21010 `"Anguilla"', add
label define ukctrylbl 21020 `"Antigua-Barbuda"', add
label define ukctrylbl 21030 `"Aruba"', add
label define ukctrylbl 21040 `"Bahamas"', add
label define ukctrylbl 21050 `"Barbados"', add
label define ukctrylbl 21060 `"British Virgin Islands"', add
label define ukctrylbl 21070 `"Cayman Isles"', add
label define ukctrylbl 21080 `"Cuba"', add
label define ukctrylbl 21090 `"Dominica"', add
label define ukctrylbl 21100 `"Dominican Republic"', add
label define ukctrylbl 21110 `"Grenada"', add
label define ukctrylbl 21120 `"Guadeloupe"', add
label define ukctrylbl 21130 `"Haiti"', add
label define ukctrylbl 21140 `"Jamaica"', add
label define ukctrylbl 21150 `"Martinique"', add
label define ukctrylbl 21160 `"Montserrat"', add
label define ukctrylbl 21170 `"Netherlands Antilles"', add
label define ukctrylbl 21180 `"Puerto Rico"', add
label define ukctrylbl 21190 `"St. Kitts-Nevis"', add
label define ukctrylbl 21200 `"St. Croix"', add
label define ukctrylbl 21210 `"St. John"', add
label define ukctrylbl 21220 `"St. Lucia"', add
label define ukctrylbl 21230 `"St Thomas"', add
label define ukctrylbl 21240 `"St. Vincent"', add
label define ukctrylbl 21250 `"Trinidad and Tobago"', add
label define ukctrylbl 21260 `"Turks and Caicos"', add
label define ukctrylbl 21270 `"U.S. Virgin Islands"', add
label define ukctrylbl 21990 `"Other Caribbean and n.s."', add
label define ukctrylbl 21991 `"Caribbean commonwealth, n.s."', add
label define ukctrylbl 22000 `"Central America"', add
label define ukctrylbl 22010 `"Belize/British Honduras"', add
label define ukctrylbl 22020 `"Costa Rica"', add
label define ukctrylbl 22030 `"El Salvador"', add
label define ukctrylbl 22040 `"Guatemala"', add
label define ukctrylbl 22050 `"Honduras"', add
label define ukctrylbl 22060 `"Mexico"', add
label define ukctrylbl 22070 `"Nicaragua"', add
label define ukctrylbl 22080 `"Panama"', add
label define ukctrylbl 22081 `"Panama Canal Zone"', add
label define ukctrylbl 22990 `"Central America, n.s."', add
label define ukctrylbl 22991 `"Central America and Caribbean"', add
label define ukctrylbl 23000 `"South America"', add
label define ukctrylbl 23010 `"Argentina"', add
label define ukctrylbl 23020 `"Bolivia"', add
label define ukctrylbl 23030 `"Brazil"', add
label define ukctrylbl 23040 `"Chile"', add
label define ukctrylbl 23050 `"Colombia"', add
label define ukctrylbl 23060 `"Ecuador"', add
label define ukctrylbl 23070 `"Falkland Islands"', add
label define ukctrylbl 23080 `"French Guiana"', add
label define ukctrylbl 23090 `"Guyana/British Guiana"', add
label define ukctrylbl 23100 `"Paraguay"', add
label define ukctrylbl 23110 `"Peru"', add
label define ukctrylbl 23120 `"Suriname"', add
label define ukctrylbl 23130 `"Uruguay"', add
label define ukctrylbl 23140 `"Venezuela"', add
label define ukctrylbl 23990 `"South America, other and n.s."', add
label define ukctrylbl 23991 `"South America or Central America, n.s."', add
label define ukctrylbl 23992 `"Central/South America and Caribbean"', add
label define ukctrylbl 24000 `"North America"', add
label define ukctrylbl 24010 `"Bermuda"', add
label define ukctrylbl 24020 `"Canada"', add
label define ukctrylbl 24030 `"Greenland"', add
label define ukctrylbl 24040 `"United States"', add
label define ukctrylbl 24990 `"North America, other and n.s."', add
label define ukctrylbl 24991 `"North America/Oceania"', add
label define ukctrylbl 29990 `"Americas, other and n.s."', add
label define ukctrylbl 30000 `"Asia"', add
label define ukctrylbl 31000 `"Eastern Asia"', add
label define ukctrylbl 31010 `"China"', add
label define ukctrylbl 31011 `"Hong Kong"', add
label define ukctrylbl 31012 `"Macau"', add
label define ukctrylbl 31013 `"Taiwan"', add
label define ukctrylbl 31020 `"Japan"', add
label define ukctrylbl 31030 `"Korea"', add
label define ukctrylbl 31031 `"Korea, DPR (North)"', add
label define ukctrylbl 31032 `"Korea, RO (South)"', add
label define ukctrylbl 31040 `"Mongolia"', add
label define ukctrylbl 31990 `"Eastern Asia, n.s."', add
label define ukctrylbl 32000 `"South-Central Asia"', add
label define ukctrylbl 32010 `"Afghanistan"', add
label define ukctrylbl 32020 `"Bangladesh"', add
label define ukctrylbl 32030 `"Bhutan"', add
label define ukctrylbl 32040 `"India"', add
label define ukctrylbl 32041 `"India/Pakistan"', add
label define ukctrylbl 32042 `"India/Pakistan/Bangladesh/Sri Lanka"', add
label define ukctrylbl 32050 `"Iran"', add
label define ukctrylbl 32060 `"Kazakhstan"', add
label define ukctrylbl 32070 `"Kyrgyzstan"', add
label define ukctrylbl 32080 `"Maldives"', add
label define ukctrylbl 32090 `"Nepal"', add
label define ukctrylbl 32100 `"Pakistan"', add
label define ukctrylbl 32101 `"Pakistan/Bangladesh"', add
label define ukctrylbl 32110 `"Sri Lanka (Ceylon)"', add
label define ukctrylbl 32120 `"Tajikistan"', add
label define ukctrylbl 32130 `"Turkmenistan"', add
label define ukctrylbl 32140 `"Uzbekistan"', add
label define ukctrylbl 32999 `"South-Central Asia, n.s."', add
label define ukctrylbl 33000 `"South-Eastern Asia"', add
label define ukctrylbl 33010 `"Brunei"', add
label define ukctrylbl 33020 `"Cambodia (Kampuchea)"', add
label define ukctrylbl 33030 `"East Timor"', add
label define ukctrylbl 33040 `"Indonesia"', add
label define ukctrylbl 33050 `"Laos"', add
label define ukctrylbl 33060 `"Malaysia"', add
label define ukctrylbl 33070 `"Myanmar (Burma)"', add
label define ukctrylbl 33080 `"Philippines"', add
label define ukctrylbl 33090 `"Singapore"', add
label define ukctrylbl 33100 `"Thailand"', add
label define ukctrylbl 33110 `"Vietnam"', add
label define ukctrylbl 33990 `"South-Eastern Asia, n.s."', add
label define ukctrylbl 34000 `"Western Asia"', add
label define ukctrylbl 34010 `"Armenia"', add
label define ukctrylbl 34020 `"Azerbaijan"', add
label define ukctrylbl 34030 `"Bahrain"', add
label define ukctrylbl 34040 `"Cyprus"', add
label define ukctrylbl 34050 `"Georgia"', add
label define ukctrylbl 34051 `"Abkhazia"', add
label define ukctrylbl 34052 `"South Ossetia"', add
label define ukctrylbl 34060 `"Iraq"', add
label define ukctrylbl 34070 `"Israel"', add
label define ukctrylbl 34071 `"Israel/Palestine"', add
label define ukctrylbl 34080 `"Jordan"', add
label define ukctrylbl 34090 `"Kuwait"', add
label define ukctrylbl 34100 `"Lebanon"', add
label define ukctrylbl 34110 `"Palestinian Territories"', add
label define ukctrylbl 34111 `"West Bank"', add
label define ukctrylbl 34112 `"Gaza Strip"', add
label define ukctrylbl 34120 `"Oman"', add
label define ukctrylbl 34130 `"Qatar"', add
label define ukctrylbl 34140 `"Saudi Arabia"', add
label define ukctrylbl 34150 `"Syria"', add
label define ukctrylbl 34151 `"Syria/Lebanon"', add
label define ukctrylbl 34160 `"Turkey"', add
label define ukctrylbl 34170 `"United Arab Emirates"', add
label define ukctrylbl 34180 `"Yemen"', add
label define ukctrylbl 34990 `"Western Asia, n.s."', add
label define ukctrylbl 34991 `"Middle East"', add
label define ukctrylbl 39990 `"Asia, other and n.s."', add
label define ukctrylbl 39991 `"Central Asia and Middle East, n.s."', add
label define ukctrylbl 39992 `"Far East, n.s."', add
label define ukctrylbl 39993 `"Eastern/Southeast Asia, n.s."', add
label define ukctrylbl 39994 `"Asia/Middle East, other and n.s."', add
label define ukctrylbl 39995 `"South/Southeast Asia, n.s."', add
label define ukctrylbl 40000 `"Europe"', add
label define ukctrylbl 41000 `"Eastern Europe"', add
label define ukctrylbl 41010 `"Belarus"', add
label define ukctrylbl 41020 `"Bulgaria"', add
label define ukctrylbl 41021 `"Bulgaria/Greece"', add
label define ukctrylbl 41030 `"Czech Republic/Czechoslovakia"', add
label define ukctrylbl 41040 `"Hungary"', add
label define ukctrylbl 41050 `"Poland"', add
label define ukctrylbl 41060 `"Moldova"', add
label define ukctrylbl 41070 `"Romania"', add
label define ukctrylbl 41080 `"Russia/USSR"', add
label define ukctrylbl 41090 `"Slovakia"', add
label define ukctrylbl 41100 `"Ukraine"', add
label define ukctrylbl 41990 `"Eastern Europe, other and n.s."', add
label define ukctrylbl 41991 `"Albania, Bulgaria, Czech, Hungary, Romania, Yugoslavia"', add
label define ukctrylbl 41992 `"Central-Eastern Europe"', add
label define ukctrylbl 42000 `"Northern Europe"', add
label define ukctrylbl 42010 `"Denmark"', add
label define ukctrylbl 42020 `"Estonia"', add
label define ukctrylbl 42030 `"Faroe Islands"', add
label define ukctrylbl 42040 `"Finland"', add
label define ukctrylbl 42050 `"Iceland"', add
label define ukctrylbl 42060 `"Ireland"', add
label define ukctrylbl 42070 `"Latvia"', add
label define ukctrylbl 42080 `"Lithuania"', add
label define ukctrylbl 42090 `"Norway"', add
label define ukctrylbl 42100 `"Svalbard and Jan Mayen Islands"', add
label define ukctrylbl 42110 `"Sweden"', add
label define ukctrylbl 42120 `"United Kingdom"', add
label define ukctrylbl 42990 `"Northern Europe, n.s."', add
label define ukctrylbl 43000 `"Southern Europe"', add
label define ukctrylbl 43010 `"Albania"', add
label define ukctrylbl 43020 `"Andorra"', add
label define ukctrylbl 43030 `"Bosnia and Herzegovina"', add
label define ukctrylbl 43040 `"Croatia"', add
label define ukctrylbl 43050 `"Gibraltar"', add
label define ukctrylbl 43060 `"Greece"', add
label define ukctrylbl 43070 `"Italy"', add
label define ukctrylbl 43071 `"Vatican City"', add
label define ukctrylbl 43080 `"Malta"', add
label define ukctrylbl 43090 `"Portugal"', add
label define ukctrylbl 43100 `"San Marino"', add
label define ukctrylbl 43110 `"Slovenia"', add
label define ukctrylbl 43120 `"Spain"', add
label define ukctrylbl 43121 `"Spain/Portugal"', add
label define ukctrylbl 43130 `"Macedonia"', add
label define ukctrylbl 43140 `"Yugoslavia"', add
label define ukctrylbl 43141 `"Montenegro"', add
label define ukctrylbl 43142 `"Serbia"', add
label define ukctrylbl 43143 `"Serbia and Montenegro"', add
label define ukctrylbl 43144 `"Kosovo"', add
label define ukctrylbl 43990 `"Southern Europe, n.s."', add
label define ukctrylbl 43991 `"Gibraltar/Malta"', add
label define ukctrylbl 43992 `"Portugal/Greece"', add
label define ukctrylbl 43993 `"Italy, Holy See, San Marino"', add
label define ukctrylbl 44000 `"Western Europe"', add
label define ukctrylbl 44010 `"Austria"', add
label define ukctrylbl 44020 `"Belgium"', add
label define ukctrylbl 44021 `"Belgium/Luxemburg"', add
label define ukctrylbl 44022 `"Belgium/Netherlands/Luxemburg"', add
label define ukctrylbl 44030 `"France"', add
label define ukctrylbl 44040 `"Germany"', add
label define ukctrylbl 44041 `"Germany/Austria"', add
label define ukctrylbl 44042 `"West Germany"', add
label define ukctrylbl 44043 `"Mecklenburg-Schwerin"', add
label define ukctrylbl 44050 `"Liechtenstein"', add
label define ukctrylbl 44060 `"Luxembourg"', add
label define ukctrylbl 44070 `"Monaco"', add
label define ukctrylbl 44080 `"Netherlands"', add
label define ukctrylbl 44090 `"Switzerland"', add
label define ukctrylbl 44990 `"Western Europe, n.s."', add
label define ukctrylbl 44991 `"Belgium, Denmark, Luxembourg, Netherlands"', add
label define ukctrylbl 49991 `"Turkey and U.S.S.R."', add
label define ukctrylbl 49992 `"European Union"', add
label define ukctrylbl 49993 `"European Union (original 15)"', add
label define ukctrylbl 49994 `"Other European Union (not original 15)"', add
label define ukctrylbl 49995 `"EEA, Switzerland, associated microstates"', add
label define ukctrylbl 49999 `"Europe, other and n.s."', add
label define ukctrylbl 50000 `"Oceania"', add
label define ukctrylbl 51000 `"Australia and New Zealand"', add
label define ukctrylbl 51010 `"Australia"', add
label define ukctrylbl 51020 `"New Zealand"', add
label define ukctrylbl 51030 `"Norfolk Islands"', add
label define ukctrylbl 51999 `"Australia and New Zealand, n.s."', add
label define ukctrylbl 52000 `"Melanesia"', add
label define ukctrylbl 52010 `"Fiji"', add
label define ukctrylbl 52020 `"New Caledonia"', add
label define ukctrylbl 52030 `"Papua New Guinea"', add
label define ukctrylbl 52040 `"Solomon Islands"', add
label define ukctrylbl 52050 `"Vanuatu (New Hebrides)"', add
label define ukctrylbl 52999 `"Melanesia, n.s."', add
label define ukctrylbl 53000 `"Micronesia"', add
label define ukctrylbl 53010 `"Kiribati"', add
label define ukctrylbl 53020 `"Marshall Islands"', add
label define ukctrylbl 53030 `"Nauru"', add
label define ukctrylbl 53040 `"Northern Mariana Isls."', add
label define ukctrylbl 53050 `"Palau"', add
label define ukctrylbl 53990 `"Micronesia, n.e.c."', add
label define ukctrylbl 54000 `"Polynesia"', add
label define ukctrylbl 54010 `"Cook Islands"', add
label define ukctrylbl 54020 `"French Polynesia"', add
label define ukctrylbl 54030 `"Niue"', add
label define ukctrylbl 54040 `"Pitcairn Island"', add
label define ukctrylbl 54050 `"Samoa"', add
label define ukctrylbl 54060 `"Eastern Samoa"', add
label define ukctrylbl 54070 `"Tokelau"', add
label define ukctrylbl 54080 `"Tonga"', add
label define ukctrylbl 54090 `"Tuvalu"', add
label define ukctrylbl 54100 `"Wallis and Futuna Isls."', add
label define ukctrylbl 54990 `"Polynesia, n.s."', add
label define ukctrylbl 55000 `"U.S. Pacific Possessions"', add
label define ukctrylbl 55010 `"American Samoa"', add
label define ukctrylbl 55020 `"Baker Island"', add
label define ukctrylbl 55030 `"Guam"', add
label define ukctrylbl 55040 `"Howland Island"', add
label define ukctrylbl 55050 `"Johnston Atoll"', add
label define ukctrylbl 55060 `"Kingman Reef"', add
label define ukctrylbl 55070 `"Midway Islands"', add
label define ukctrylbl 55080 `"Wake Island"', add
label define ukctrylbl 55990 `"Other US Pacific"', add
label define ukctrylbl 59990 `"Oceania, n.s."', add
label define ukctrylbl 60000 `"OTHER ABROAD"', add
label define ukctrylbl 60100 `"U.S. Outlying Areas and Territories"', add
label define ukctrylbl 60200 `"Africa/Other"', add
label define ukctrylbl 60300 `"Central/South America or Africa"', add
label define ukctrylbl 60400 `"Asia/Africa"', add
label define ukctrylbl 60500 `"Europe, Australia, New Zealand"', add
label define ukctrylbl 60600 `"Other commonwealth"', add
label define ukctrylbl 60700 `"Asia, Australia, Oceania, n.s."', add
label define ukctrylbl 69900 `"Other countries, not specified"', add
label define ukctrylbl 80000 `"AT SEA"', add
label define ukctrylbl 99999 `"Unknown"', add
label values ukctry ukctrylbl
ren ukctry ukgroups

merge m:1 ethn using ./temp/IMR-Spain.dta
keep if _m==3
drop _m ethnlbl spainctrylbl
label define spainctrylbl 26091 "Americas, n.s.", add
label define spainctrylbl 24020 "Canada", add
label define spainctrylbl 60099 "Africa, other and n.s.", add
label define spainctrylbl 22060 "Mexico", add
label define spainctrylbl 22020 "Costa Rica", add
label define spainctrylbl 22030 "El Salvador", add
label define spainctrylbl 22040 "Guatemala", add
label define spainctrylbl 22050 "Honduras", add
label define spainctrylbl 22070 "Nicaragua", add
label define spainctrylbl 22080 "Panama", add
label define spainctrylbl 21080 "Cuba", add
label define spainctrylbl 21100 "Dominican Republic", add
label define spainctrylbl 23010 "Argentina", add
label define spainctrylbl 23020 "Bolivia", add
label define spainctrylbl 23030 "Brazil", add
label define spainctrylbl 23040 "Chile", add
label define spainctrylbl 23050 "Colombia", add
label define spainctrylbl 23060 "Ecuador", add
label define spainctrylbl 23100 "Paraguay", add
label define spainctrylbl 23110 "Peru", add
label define spainctrylbl 23130 "Uruguay", add
label define spainctrylbl 23140 "Venezuela", add
label define spainctrylbl 40100 "Nordic Region", add
label define spainctrylbl 49999 "Europe, other and n.s.", add
label define spainctrylbl 42060 "Ireland", add
label define spainctrylbl 44020 "Belgium", add
label define spainctrylbl 44030 "France", add
label define spainctrylbl 44080 "Netherlands", add
label define spainctrylbl 43070 "Italy", add
label define spainctrylbl 43090 "Portugal", add
label define spainctrylbl 43120 "Spain", add
label define spainctrylbl 41020 "Bulgaria", add
label define spainctrylbl 46500 "USSR/Russia", add
label define spainctrylbl 44040 "Germany", add
label define spainctrylbl 41050 "Poland", add
label define spainctrylbl 41070 "Romania", add
label define spainctrylbl 45700 "Yugoslavia", add
label define spainctrylbl 41100 "Ukraine", add
label define spainctrylbl 34010 "Armenia", add
label define spainctrylbl 31010 "China", add
label define spainctrylbl 31013 "Taiwan", add
label define spainctrylbl 31020 "Japan", add
label define spainctrylbl 59900 "Asia, other and n.s.", add
label define spainctrylbl 33080 "Philippines", add
label define spainctrylbl 32040 "India", add
label define spainctrylbl 32020 "Bangladesh", add
label define spainctrylbl 32100 "Pakistan", add
label define spainctrylbl 54700 "Middle East", add
label define spainctrylbl 34070 "Israel", add
label define spainctrylbl 34080 "Jordan", add
label define spainctrylbl 34100 "Lebanon", add
label define spainctrylbl 34150 "Syria", add
label define spainctrylbl 34160 "Turkey", add
label define spainctrylbl 13040 "Morocco", add
label define spainctrylbl 15060 "Ghana", add
label define spainctrylbl 15130 "Nigeria", add
label define spainctrylbl 15150 "Senegal", add
label define spainctrylbl 12020 "Cameroon", add
label define spainctrylbl 51000 "Oceania", add
label values spainctry spainctrylbl
ren spainctry spaingroups

merge m:1 ethn using ./temp/wacziarggroups.dta
keep if _m==3
drop _m ethnlbl
merge m:1 ethn using ./temp/wacziargwtduscomp.dta
keep if _m==3
drop _m

* Finalize file
gen census=`x'
compress
order census ethn
sort ethn
save ./temp/`x'-Core1.dta, replace
}

* Delete files no longer needed
for any Gravity Size: erase ./temp/US-X.dta
for any 1980 1990 2000 2010 2018: erase ./temp/X-weights.dta
for any 1980 1990 2000 2010 2018: erase ./temp/X-OverageResults.dta
for any wacziarggroups wacziargwtduscomp IMR-Spain IMR-US IMR-UK: erase ./temp/X.dta
 
* End of program
log close